Date: Mon, 3 Feb 92 15:09:01 -0800 From: Jeff (Grambo) Graham STRAND SOFTWARE TECHNOLOGIES, INC. STRAND88 TECHNICAL DESCRIPTION Buckingham Release July 1990 SCOPE OF THIS DOCUMENT This document provides a short technical description of STRAND88. Readers who are unfamiliar with the ideas of concurrency should read the companion document STRAND88 Technical Overview first This document is not intended to be a management level description. STRAND AND STRAND88 The two terms Strand and STRAND88 have precise and different meanings. Strand is a language definition that is, by and large, in the public domain, such a definition covers only the language, its syntax and semantics but does not cover the environment or libraries. STRAND88 is a product, currently the only available implementation of Strand, which includes an abstract machine emulator, compiler, lint, environment and associated services and libraries. STRAND88 has had a number of releases, the current release is called Buckingham, and is the second major release, the previous release was called Admiralty. Within a major release, minor updates occur, for example to support new hardware, these are referred to by build number. The initial Buckingham release has build number B.6. Very little in this document has been changed to reflect the upgrade from Admiralty to Buckingham. Before describing the features of Strand it Is important to discard many of the conventional components of sequential languages. In a Strand system there are no stacks and hence no stack frames, variables behave quite unlike variables in C or Pascal, the concept of flow of control does not exist, and within some constraints, the semantics of a program are independent of the order that it is written. STRAND FEATURES Strand is a language that has parallel semantics, by this we mean that statements in Strand are executed concurrently, not in the order they occur in a procedure definition. Each statement, more accurately each procedure call, is executed in a separate process, on a particular processor. A process in this context is a far cry from a large scale process. Strand processes differ from other forms of process in the following ways. Firstly they are small, consuming only a few words of memory, secondly they only ever perform one operation, that is to spawn zero or more new processes, once this is performed the process is complete and can be recycled. Lastly they may have to wait for the correct set of inputs before they can execute STRAND PROCESSES A process has a set of arguments each one of which may be a data value (simple or structured) or a variable. Data structures may be referenced by more than one process, or one process argument may be a part of another data structure referenced in some other process. Variables may also be referenced by more than one process argument, that is they may be shared. A running Strand program therefore consists of a network of processes, connected by shared data structures and variables. There are no global structures. An alternative, and equally valid view, is that of a set of data structures being constructed in core by a set of processes, each process waiting to add its part of the edifice One can represent a process as a term of the form: mod1:start(A1,A2,...Ak) That is it has a name, start, is defined in a module, mod1, and arguments A1 to Ak. We say that the number of arguments, k, is the arity of the process. Normally a process name is written in text with the arity appended, for example start/4, if start has an arity of 4. The reason for this is that Strand procedures with the same name but different arities are semantically distinct. When a process executes, we use the term reduces, it may spawn new processes, some of which may be inbuilt procedures called kernels. Kernels are the primitive operations of Strand and perform various data manipulation activities. VARIABLES Variables in Strand are single assignment, and as mentioned in the previous paragraph may be shared, i.e. part of more than one process. Variables can be assigned a value. This replaces the variable with the value assigned, it is not possible to update this replacement once assignment has occurred Assignment is only performed by kernels, the most common being the assignment kernel itself. If an attempt is made to assign to a non-variable then a system error is generated. The effect of an assignment is that all process arguments that previously referenced a variable now reference the value assigned. SHARED VARIABLES Processes that share a variable may either be readers or writers, i.e. they may be waiting to use the value, or to write it. It is clear from the behavior of variable assignment that only one writer should ever exist for each variable and that multiple writers will lead to a system error. We are describing the lowest level semantics of Strand here, multiple writers are obviously an important programming technique, and can be achieved by higher level techniques. MULTIPROCESSORS Since the value assigned to a variable can not be changed it is always unambiguous to make a copy of the value on another processor. Therefore the situation of single writer and multiple readers where the readers are on different processors is exactly the same as if they were on the same processor. When an assignment is made, copies of the assigned value is communicated to the other processors. It merely takes a little longer to pass through the communications subsystem. To make one reader reduce on another node that reader process needs to be spawned on the desired node No other changes are necessary. REDUCTION EXAMPLE Suppose we had a process, called start, that spawned three processes, one writer, rite, and two readers, reeda and reedb, that share a variable 'X', then we can imagine the following series of events unfolding as execution proceeds (refer to diagram) ------------------------------------------------------------- | Time 0 start() | ------------------------------------------------------------- reduce start/0 ------------------------------------------------------------- | Time 1 rite(X) -----------\ | | reeda(X) ------------> Variable: X | | reedb(X) -----------/ | ------------------------------------------------------------- reduce rite/1 ------------------------------------------------------------- | Time 2 X := 42; -----------\ | | reeda(X) ------------> Variable: X | | reedb(X) -----------/ | ------------------------------------------------------------- assign 42 to X ------------------------------------------------------------- | Time 3 reeda(X) ------------> 42 | | reedb(X) -----------/ | ------------------------------------------------------------- reduce reeda/1 and reedb/1 ------------------------------------------------------------- | Time 4 | ------------------------------------------------------------- At time 0 a single process, start(), exists. Reducing this results in three processes, rite/1, reeda/1 and reedb/1. One process is then selected for reduction, we will assume it is rite/1, reducing this results in an assignment kernel being spawned [In the actual implementation kernels are normally executed 'in-line' for efficiency, the description here is a conceptual model] At time step 2 the assignment kernel, here shown as the assignment operator ':=' assigns 42 to the variable X. The two readers can now be reduced and read the value 42. Note that the reader processes can not reduce until the value is available. A process that is in this state is said to be suspended, in this case suspended on the variable X. Suspended processes wait quietly for the variable or variables they are waiting for to be assigned. The reduction of a process is described by a procedure which is comprised of a set of clauses. Clauses have the following general form: H :- G1, G2,...Gm|B1, B2,...Bn m,n >= 0 Where H is the clause head, ':-' is known by convention as the implication operator, the Gs constitute the clause's guard, '|' is the commit operator and the Bs are the clause's body. The clause definition is terminated by a period. Clauses define basic process actions in the following way. The head and guard together specify a set of preconditions under which a process may be reduced. A clause of the form H :- G1, G2,...Gn|B1. specifies that if its preconditions are met then the process changes to a new state as specified by B1. A general clause of the form H :- G1, G2,...Gm|B1, B2,...Bn specifies that if the preconditions are met then the process changes to a state as specified by one of the body calls and simultaneously spawns or forks processes as specified by the other body calls. CLAUSE STRUCTURE, PROCEDURE DEFINITIONS AND MODULES The clause head describes a state which is used to match against the state of a process. it has the form H(T1, T2,...Tn) n>=0 The head consists of the name H (a string beginning with a lower case character) of the procedure to which the clause belongs and the argument list T1,T2,...Tn. The number of elements in the formal argument list is the arity. The clause guard contains a sequence of guard kernel calls. A guard kernel call invokes a predefined test operation known as a guard kernel which is applied to the state of a given process. The language provides a set of guard kernels. It is also possible for the programmer to define user guard kernels using the foreign language interface . A guard kernel call has the form G(T1, T2,...Tn) n>=0 A guard kernel call consists of the guard kernel name G (a string beginning with a lower case character) and an argument list. The clause body contains a collection of body calls and assignments (see below). A body call has the form B(T1, T2,...Tn) n>=0 Where B is the name of the procedure and n its arity and T1, T2,....Tn is its argument list. A body call can be one of: - a [body] kernel call. This invokes functionality provided by Strand the language - a procedure call [to a procedure defined in the same module (see below)] - a system call [to the environment of the form C!] - an external procedure call [of the form M:C] (an external procedure call contains two components - a module M and a call C to a procedure in the module) - an addressed call [of the form C@N] (an addressed call contains two components - an address N and an indirect call C. The latter can be a system call or an external procedure call). A procedure definition is formed by grouping together clauses the heads of which have the same name and arity. The code of a Strand program is composed of one or more modules. A module is a collection of procedures, the definition of compilation mode and an exports list. The exports list is a specification of the procedures contained in the module which may be referenced by procedures from other modules. Generally, the structure of a Strand module can be viewed as: specification of compilation mode specification of exports list procedure1 procedure2 ... The name of a module is determined by that of the file (with a .sam extension) from which it is loaded, and therefore must conform to the host operating system rules. The .sam suffix of compiled module files is assumed, and need not be specified. Modules are developed using an external editor to write Strand source code files which conform to the structure described above. The source file must be compiled by the Strand compiler before the procedures they contain can be used. Strand source code files have the suffix .std. Compiled module files have the suffix .sam. A module, once loaded, is a first class Strand data type and can be manipulated by Strand programs in the same way as data types which can be represented directly in Strand source code. DATA STRUCTURES IN STRAND Tuples Tuples are denoted by a collection of data items enclosed in braces i.e.{} and separated by commas. The arity of a tuple is the number of elements it contains. For example, the following tuple may represent a building: {house, 77,{bedrooms, 3}, Owner} In this case the arity is 4 and the elements are: the string house, the number 77, the tuple {bedrooms, 3} and the variable Owner. Lists Lists are used to represent sequences of data items. A list is denoted by: [Head|Tail] where both Head and Tail are terms. The head of a list represents the first data item in the sequence; the tail represents the rest of the sequence. An empty list (i.e. one that has no head or tail) is denoted by:[]. For example [house, 77, {bedrooms, 3}, Owner] denotes a list which contains the same data as the example tuple shown above. It must be understood that the tuple shown above and the list described here are completely different data structures even though they contain the same data items. It is often valuable to be able to denote the continuation of a sequence without mentioning its elements explicitly, e.g. [house, 77 | Rest] AN EXAMPLE STRAND PROGRAM This section provides a complete Strand program which allows illustration of most of the terminology introduced previously. % This module 'member' implements the procedure member/3 which is % used to determine whether a data item is present in a list. % Its format is - member(Item?, List?, Result^) - where Item and % List are input arguments and Result is an output argument which % is assigned the value 'true' or 'false' depending on whether Item % is present in List. % If List is not a list, Result is the value [] and an error is % signalled. -compile(free).% compilation mode -exports([member/3]). % exports list member(_,[],Result) :- Result := false % C1 Item not found in % List. member(X,[X|_],Result) :- Result := true % C2 Item is first % element in List. member(X,[Y| Rest],Result) :- % C2 Item not first X =\= Y| % element in List % so search rest of list. member(X, Rest, Result). member(_, IncorrectInput, Result) :- % C4 for debugging. Trap otherwise| % cases where second arg Result := [], % is not a list. Send error error('arg not a list ' % message and set Result to ,IncorrectInput)!. % empty list to allow other % process waiting for % Result to react. Notice the following points: 1. C1 and C2 and C3 all use the anonymous variable as a formal variable to indicate they do not care about the value of a formal argument. 2. C3 uses the formal variable Rest as an actual argument in the recursive procedure call to member. 3. C3 makes a guard kernel call to the guard kernel '=\=' and C4 makes a guard kernel call to the guard kernel 'otherwise'. 4. C4 makes a system call to notify the occurrence of an error. 5. C1, C2 and C3 all use a literal structure as a formal argument in each case this structure is used to match against a process argument to determine whether the clause can be used to reduce the process (matching is covered in detail in the next section). 6. C1 and C2 both use the assignment operator ':=' to assign a literal value to a formal variable. (Assignment is covered in more detail in the next section.) STRAND PROGRAM EXECUTION A Strand computation corresponds to a pool of concurrently executing processes Recall that each process can be represented by a term of the form: p(T1, T2,...Tn) (n>=0) Where p/n identifies a procedure used to execute the process by its name and arity and T1,...., Tn are the process arguments. The arguments are data structures that comprise the state of the process. Computation proceeds by repeatedly removing a process from the pool and attempting to reduce it. Notice that this description does not preclude the possibility of simultaneously attempting to reduce more than one process. A reduction attempt involves finding a clause from the associated procedure definition whose head matches the process state and the execution of the clause's guard. If the preconditions specified by a given clause head and guard are satisfied then the process commits to that particular clause. There may be more than one clause whose head matches the process state, in this instance the first one matching would be selected. It is important that the programmer realize this since it imposes a requirement for an appropriate degree of specificity in the definition of the preconditions of clauses. It also serves to explain the need for guards in addition to the matching process. Once a process commits to a clause it is reduced. Recall that reduction involves only three kinds of action: termination, changing state and forking. MATCHING Strand employs a matching algorithm to compare a clause head with a process state. A process state matches with a clause head if the name of the procedure associated with the process is the same as the name of the clause head and the number of process arguments is the same as the number of the clause's formal arguments and the process arguments match with the formal arguments. Strings match if the two strings are identical - they contain exactly the same sequence of characters. Numbers only match if they are identical. In addition integers do not match against their real counterparts. It is normally dubious practice to match two real numbers. Variables match against variables. Structures match if they are of the same size and type and the elements they contain match. Thus a tuple can only match against a tuple of the same arity and a list can only match against a list as long as the specified elements match. Although the matching algorithm may bind variables in the clause head, it may not assign values to variables in the process state. An important concept for Strand programming is data flow synchronization where matching is suspended until sufficient data is available to determine the outcome of a match. In essence, it is the availability of data that determines when processes are able to execute GUARD EXECUTION We have previously mentioned guard kernels and guard kernel calls. A guard kernel call appears in the guard of a clause and invokes the test operation specified by the guard kernel. These tests are built into Strand and can be used to perform type checking, arithmetic comparison and term comparison. Like matching they serve to constrain when a clause can be used to reduce a process. Any number of guard kernel calls may appear in the guard of a clause. They are executed from left to right textually after matching is complete and has succeeded. Each test may succeed, fail or suspend. If all tests In a guard succeed then the guard as a whole is said to succeed; if any test fails or suspends then the guard as a whole immediately fails or suspends respectively. Tests generally suspend if they encounter a variable. The guard tests fall into five basic categories. These are: data type tests comparisons assignment tests (does the variable have a value or not) failure of match and guard tests in all other clauses system state tests. ASSIGNMENT Assignment is used to generate a value for a variable that appears in a process argument or to generate a value for a local variable. Remember a variable matches against a variable. Assignment is represented in program text as follows: Variable := Value Assignment calls can only appear in the body of a clause. One way to visualize the assignment operation is to think of a variable as a box. The box has a label which corresponds to its name and it can hold any data structure. The effect of executing an assignment is to place a value in this box; any part of a program which refers to the variable via its label can access its value. Although this operation is simple there are some subtler points that must be appreciated. Once a value has been assigned to a variable it cannot be removed or changed. Variables which have this property are often called single assignment variables. Since it is possible to assign anything to a variable it must be possible to assign a variable to a variable. For example consider the assignment Thing1 := Thing2 After this assignment, any part of the program that refers to thing1 automatically has access to the 'box' associated with Thing2. Another way to think of this is that Thing1 becomes an alias (i.e. another name) for Thing2. Finally, a 'box' cannot be placed inside itself; indeed anything containing a box cannot be placed inside that 'box'. Thus the assignments Foo := Foo and Foo := [Baz, Bar, Foo] are defined to be illegal. Another way of putting this is that circular references and circular terms cannot be manipulated in Strand. BODY CALLS The distinction between a body kernel call and a procedure call is simply that body kernel calls invoke functionality specified by the language, whereas procedure calls refer to procedures defined by the programmer. The exact details of what takes place given a body kernel call is implementation specific. However, in order to retain the simplicity of the model presented so far it is sufficient to say that the system attempts to 'reduce' the 'process' associated with a body kernel immediately without placing it in the process pool. It is only if the 'reduction' attempt suspends that a process is placed in the process pool. Now let us look in more detail at procedure calls. Recall that once a process has committed to a particular clause there are only three kinds of action that can be taken to reduce the process. We have already seen how the general form of the body is related to each of these actions. All we need to do now is describe in slightly more detail how a procedure call is related to a change of state in a process and the spawning of further processes. One of the procedure calls in the body causes the process to change state if we recall that a process can be represented by a term of the form p(T1, T2,... Tn) n>=0 and a procedure call has the general form b(T1, T2,... Tn) n>=0 It is easy to see that the process is transformed so that it is associated with the procedure definition that has the same name and arity as the procedure call and the process arguments now become those specified by the actual arguments of the procedure call. The data structures that become the process arguments are comprised of the literal values that appeared in the call plus the values that any variables might have received (which might themselves be variables) through matching or assignment and any local variables introduced in the clause that appear in the call. When the process has been thus transformed it is placed back in the process pool to be further reduced. STRAND88 THE PRODUCT STRAND88 as a product consists of more than just a strand compiler and abstract machine emulator, it also provides an interactive shell and an environment to aid software development. The Strand Abstract Machine emulator, or SAM, is the part of the system that performs the low level emulation. This is designed to run on a variety of machines and enables compiled Strand code to be run on any SAM implementation. When strand is started by typing 'st' to the operating system prompt, it is the SAM that is invoked. The services provide functions such as intermodule calling, I/O, and monitoring computations. The shell provides an user interface and starts applications, compilations and other utility programs. EXAMPLE SHELL SESSION Strand is started by typing the following command in response to the OS prompt: st The first thing you see when Strand starts running is something like this: Welcome to STRAND88(TM). ... Copyright (c) Artificial Intelligence Limited ... 1 node invoked. 1> The 1> is the STRAND88 Shell service prompt. Suppose that the file demo1.std has the following contents: -compile(free). -exports([min/3]). min(X,Y,Z):- X < Y | Z := X. min(X,Y,Z):- X >= Y | Z := Y. min(X,Y,Z):- otherwise | Z := error("illegal values",X,Y). Lets start by calling the min procedure that is defined in the demo1 module. We must tell Strand where the definition of the call can be found, the call we want to use and any arguments. Normally it is necessary to compile a module before it can be called, but we can assume that demo1 has been compiled already. 1>demo1:min(23,45,Z) Strand responds with: [started,1] [exited,1] 2> Strand has started the correct definition of the min call to use, reduced it to its kernel components and executed them. The started and exited text shows you that the call typed on input line 1 has been started and exited The 2> is Strand's prompt for another input, this time on the next line. For simplicity, the rest of this document ignores strand's response of started and exited. The minimum of 23 and 45 has been assigned to the variable Z but we have not seen the result yet. This variable is remembered by the shell, and whenever the shell sees Z in the future it will substitute the appropriate value. There are a number of ways in which we can examine the value of such shell variables. The easiest mechanism is to use the watch facility. This requests that the value assigned to a variable be displayed when that value is assigned. In other words the result of the watch request may not be seen for a long time, if the assignment is not performed for a while. If a value has already been assigned then it is printed immediately the request is made. A watch request is made by using the postfix operator '^'. 3>Z^ Z == 23 4> Shell variables can also be assigned directly: 4>X := 123 5> Now we can call min again, this time using the new shell variable X, and requesting a watch on a new variable Z2: 7>demo1:min(34,12.3,Z2^) Z2 == 12.3 8> The demo2 module contains definitions for a procedure that recursively generates a list of integers. The module looks like this: -compile(free). -exports([gen/2]). gen(0,X) :- X := []. gen(N,X) :- N > 0 | X := [N|X1], N1 is N - 1, gen(N1,X1). gen(N,X) :- otherwise | X := error("not an integer",N). This module contains the procedure that defines the gen procedure, which has three clauses. The first clause simply assigns the variable X to an empty list if the first parameter is zero. The second clause is more complicated. This definition may be used when the first clause cannot be used. This condition is checked with the "N>0" guard. When Strand uses this definition, it makes X a list whose first element is the value of N and whose second element is an unassigned variable called X1. Another new variable, N1, is created which is one less than the value of N. The definition concludes by calling gen again, this time with the new parameters. By repeated recursion the list is extended until N is zero, when the first clause is used to terminate the list correctly. The third clause is invoked if the first parameter is not an integer, and assigns the output parameter to an error structure. To generate a list of 10 integers, and see the result, we can type in the following: 9>demo2:gen(10,I^) [I/1, 10] [I/2, 9] [I/3, 8] [I/4, 7] [I/5, 6] [I/6, 5] [I/7, 4] [I/8, 3] [I/9, 2] [I/10, 1] [I/tail,[],'at index', 10] 10> Procedures such as gen/2 that create a list, or stream, are referred to as producers, and normally are coupled with procedures that accept a list as input, or consumers. The concept of a stream is just a metaphor for a list that is not complete. A simple consumer is sum: -compile(free). -exports([sum/2]). sum(L,R):- sum(L,0,R). sum([],Accum,R):- R := Accum. sum([H|T],Accum,R):- integer(H) | Accum1 is Accum + H, sum(T,Accum1,R). sum(L,_,R):- otherwise | R := error("invalid value",L). The first feature to note is that this file contains two different procedures. Both have the same name, but different numbers of arguments (arity). The two procedures are sum/2 and sum/3, these are totally distinct as far as the Strand system is concerned. In practice, of course, it might be better to use different names, but that would miss the point. This example is similar to the previous example but uses a list construct in the head of sum/3 to decompose the list. Try calling sum on a small list: 10>demo3:sum([3,4,5],R1^) R1 == 12 11> We can tie together the producer and consumer by linking the two procedures. A module that does this look like: -compile(free). -exports([go/1]). go(N):- integer(N) | demo2:gen(N,L), demo3:sum(L,R), output(R). go(N):- otherwise | display_n1("Error: parameter should be an integer")!. output(R):- data(R) | display("The total is: ")& display_n1(R)!. FOREIGN LANGUAGE INTERFACE Strand's foreign language Interface provides a method for including new guard and body kernels within the Strand SAM, thus extending the set of available primitive calls. These "foreign language" kernels can be called from Strand programs compiled and run on the extended SAM, independent of the presence or absence of the Strand environment. Foreign language code (i.e. non Strand definitions) must be supplied to implement the kernel functionality. This code is called when kernels are executed, either during guard testing or activation of a clause body. The foreign code has access to the kernel arguments and may also build new Strand data for return as output. A number of foreign kernels are supplied with Strand (see the Strand Reference Manual). The foreign language interface also enables users to extend the range of Strand datatypes available in the SAM. User datatypes can be used to store foreign language data structures for use by foreign code. Alternatively, they can be used to optimize access to and manipulation of data which could also be encoded using Strand structures such as tuples or lists. The foreign language interface provides one more useful function. It enables external interrupt signals to be caught and handled without endangering the correct functioning of the SAM. The interrupt functionality is integrated with the kernel interface. It provides for enabling/disabling of interrupt signals, notification of interrupt signals to the SAM, registration of interrupt handlers to be called by the SAM and a stream interface between the handler and a Strand 'interrupt consumer' process. A library definition file is a text file containing declarations of foreign kernels in a standard format. A declaration may spread over more than one line, but each new entry must start after a new line. It may also contain library file references, declarations to include kernel declarations from other library definition files. When malis (the program that generates the foreign kernel code) is called to build the interface code it must be provided with the name of a single libdef file as argument. It reads the file line by line recording each kernel declaration. When it reads a library file reference it inserts lines from the library definition file referred to in the declaration in place of the reference and continues processing. Once all include and kernel declarations have been read and noted malis generates the interface code used to invoke the foreign kernels declared in all the files it has read. N.B. malis will follow up nested library file references i.e. a library file included by a reference may contain other references which malis will look up and include. A foreign kernel declaration has the following format: Number Type Strand_name Foreign_name Mode Language The Mode is a list of argument mode specifiers, one per kernel argument, where a mode specifier is a symbol describing the argument. Up to 32 arguments may be defined for a foreign goal, although the kernel may have none. The argument mode specifiers are written inside round brackets, separated by commas. Each argument mode specifier indicates whether the argument specified in the foreign kernel is an input or output argument. A question mark (?) denotes an input argument, a caret (^) denotes an output argument; a hyphen (-) denotes a raw argument i.e. one that Strand should pass directly to the foreign code. The question marks, carets and hyphens are separated by commas. Additionally the types of the arguments may be specified, in which case the code to convert the types from Strand to or from C or Fortran is generated automatically. If the foreign procedure returns a result then this may be specified by using a colon after the other parameters have been defined. For example: 5103 guard s_cos cos (?) C defines a kernel with one input argument 5104 guard s_tan tan(?,?) c defines a kernel with two input arguments 5105 body s_sqrt sqrt(Real?):Real^ c defines a kernel with its first argument an real input and it second an output real 5106 guard s_sin sin(-) c defines a kernel with one raw argument FOREIGN LANGUAGE INTERFACE EXAMPLE Here is an example of the Foreign Language Interface used to write 6 kernels to manipulate a database. Firstly, the 6 'foreign' code procedures: OPEN Void open_db(name,dbu,res) REF name,dbu,res Function - Opens database file identified by name. Returns - Address of the database file in dbu and successful completion or returns failure message. CLOSE Void close_db(dbu,res) Ref dbu,res Function - Closes database file identified by dbu Returns - Success or failure DATABASE Void is_db(dbu) Ref dbu Function - Checks whether the location identified by dbu is a database. GET RECORD Void get_rec(dbu,key,result) REF dbu,key,result Function - Searches in the database identified by dbu, for record identified by key. Returns - Record identified by key or failure if not found. PUT RECORD Void put_rec(dbu,key,contents,result) REF dbu,key,contents,result Function - Puts contents in location identified by key in the database identified by dbu. Returns - Successful put or an error. DELETE RECORD void delete_rec(dbu,key,result) REF dbu,key,result Function - Deletes a record identified by key in database identified by dbu. Returns - Successful deletion or error. The corresponding library definition file entries for these 6 procedures are: 201 body db_open open_db(?,^,^) c 202 body db_close close_db(?,^) c 203 guard database is_db(?) c 204 body db_get get_db(?,?,^) c 205 body db_put put_db(?,?,?,^) c 206 body db_delete delete_db(?,?,^) c >From this library definition file, the program MALIS will produce the 6 Strand user kernels which will allow access to the 6 procedures from within the Strand program. Typical use of these kernels from within Strand would be: db(Name,Request) open(Name,Id,Result), % open requested database dbm(Id,Request,Result). % send requests dbm(Id,Request,Result) :- data(Result) | % check ready to send dbm1(Id,Request). % next request dbm1(Id,[get(Key,Result) | Request]) :- % get request get(Id,Key,Result), dbm(Id,Request,Result). dbm1(Id,[put(Key,Rec,Result) | Request]) :- % put request put(Id,Key,Rec,Result), dbm(Id,Request,Result). dbm1(Id,[delete(Key,Result) | Request]) :- % delete request delete(Id,Key,Result), dbm(Id,Request,Result). dbm1(Id,[]) :- % close request close(Id,Result), dbm(Id,Request,Result) open(Name,Id,Result):- database(Id) | db_open(Name,Id,Result). open(Name,Id,Result):- otherwise | Fail(Result). % error put(Id,Key,Record,Result):- database(Id) | db_put(Id,Key,Record,Result). put(Id,Key,Record,Result):- otherwise | Fail(Result). % error get(Id,Key,Result):- database(Id) | db_get(Id,Key,Result). get(Id,Key,Result):- otherwise | Fail(Result). % error delete(Id,Key,Result):- database(Id) | db_delete(Id,Key,Result). delete(Id,Key,Result):- otherwise | Fail(Result). % error close(Id,Result):- database(Id) | db_close(Id,Result). close(Id,Result):- otherwise | Fail(Result). % error PERPETUAL PROCESSES AND DATA FLOW Strand supports recursion and has recursive data structures. The basic idea of writing recursive algorithms is to express an algorithm in terms of itself with a set of stopping conditions. In Strand there is only one way to express iteration and that is through writing recursive or perpetual processes. These processes typically have the form: my_process(....) :- G1,G2,...Gn | B1,B2,my_process(....). . . . my_process(...). In this form there are a number of clauses which cause a new instance of "my_process" to be spawned if they are used to reduce this process. Thus the process is defined in terms of itself. Notice there is at least one termination clause which represents a stopping condition. STREAMS In Strand processes communicate through the use of shared variables, consider the following process pool: producer(C), consumer(C) These both share a common communication channel represented by the shared variable C. If this variable is a list and the tail of the list is always an unbound variable, then data can flow from producer to consumer through this data structure. An incomplete data structure of this type is known as a Stream. PROGRAMMING CLICHES Programming cliches exist in all languages and are the building blocks from which all applications are designed. In Strand there are 6 cliches: 1. Producer/ Consumer 2. Incomplete messages 3. Bounded Buffers 4. Difference Lists 5. Short Circuits 6. Blackboards PROGRAMMING IN STRAND Strand is a concurrent programming language that has parallel semantics (see portability). This does not mean however that all code written in Strand will run in parallel automatically. If processes are to run in parallel they must be coded in such a way that this is possible and correct (i.e. required data available, the correct communication data is set up). Conversely, sequential algorithms can be successfully coded in Strand if desirable. This can be achieved by synchronization through shared data (remembering that Strand is a single assignment language and a process will suspend until data is available). But the affect of distributing sequential code over several nodes will be no different than executing on a single node machine. PORTABILITY Strand has parallel semantics which means that the same code can be distributed in many different ways with no need to adapt the functionality . Distribution is achieved by annotation which does not affect the correctness of the program and can be tailored to achieve maximum performance results. Alternatively, using a standard set of annotations will give adequate performance, but complete code invariance across many configurations. In terms of user applications, providing STRAND88 is available on the particular hardware platform, there should be no problems in porting from one machine to another (assuming any external code included via the Foreign Language Interface, will port to the new machine). The user may change the annotation of process to processor for multi processor machines. This is possible because in between the user application and hardware there is the SAM (Strand Abstract Machine). This is the part of the system that does the low level emulation. A SAM must exist for the particular hardware the user wants to use (each SAM is written in 'C' and will vary slightly from machine to machine). If it does then running the same Strand code on the different machines should be possible. Portable Parallel Programming with Strand Abstract Strand is a programming language for developing applications to run on a broad range of parallel computers. It is based on an inherently parallel model. Therefore, the results from a Strand program are invariant to changes in a program's parallelism. This paper will introduce the Strand language and its interface to C and Fortran. A Strand/C program that predicts characteristics of a protein's shape is described as an example of the utility of Strand multilingual programming. Introduction The spread of parallel processing throughout the advanced computing arena is restricted by a shortage of software. This scarcity of software is largely due to the fact that each parallel computer has its own native programming tools; a situation that binds an application program to a particular machine. Lack of portable software hurts the industry from both ends of the user spectrum. The supercomputing end user is reluctant to adopt parallel processing due to the costs of parallelizing their programs for each computer. The nonportability of the resulting programs makes parallel computing a high risk endeavor. Further exasperating the situation is the barrier placed before commercial software vendors who need a broad market base in order to recoup their investments. While the combined parallel computing market is large, individual computers have too few users to support independent software vendors. Therefore, with very few exceptions, third party software vendors are staying away from the parallel processing market - denying users the parallel math libraries, data base management systems and specific applications they require. Only with portable, parallel software development tools will parallel computing be able to enter the mainstream of supercomputing. Two strategies have been used to design these tools. One approach involves extending an existing sequential language with parallel processing constructs. This is the approach taken by Schedule [1] and Linda [2]. Alternatively, a new, inherently parallel processing language can be developed such as ALP [3] or Strand [4]. Selecting the preferred strategy is a subjective decision based to some degree on the applications under consideration. It is my opinion, however, that the best long range choice is a new, inherently parallel processing language. This type of language has a better chance of evolving with the state of the art in advanced computing. As automatic parallelization techniques become available, the importance of inherently parallel languages will increase since they should map better into these schemes. The only inherently parallel programming language commercially available on a broad range of computers is STRAND88. This language runs on sequential computers as well as distributed memory and shared memory parallel computers. The goal of this paper is to introduce the Strand language and how it is currently being used to develop applications. The paper begins with a brief description of the language. This is followed by a simple example to solidify the preceding language summary. Next, the details behind calling C and Fortran subprograms are given followed by a program to compute the linear convolution of two arrays. This demonstrates both the use of C functions in a Strand program and explicit parallelism in Strand. The paper concludes with a discussion of current Strand applications with an emphasis on a protein structure program developed at Caltech [5]. The Strand Language The key to any portable parallel processing language is its architecture independent view of the parallel computer. Figure one displays the Strand computational model. The central feature of this model is the pool of light weight processes representing the state of the computation. ----------------------------------------------------------------------------- Figure 1: The Strand computational model ------------- ------------------- | Strand |----------------------------> | | | Abstract | | Process Pool | | Machine | <----------------------------| | ------------- | O | ^ ^ ^ | O O | / | \ | O | ------------------- | | | module 1 | | O O | | | | O | | proc1(...):-... | | O | | proc2(...):-... | | O | | proc3(...):-... | | O O | ------------------- ------------------- ----------------------------------------------------------------------------- A Strand computation cycle begins when the Strand Abstract Machine (SAM) removes a process from the pool. The SAM reduces this process into more primitive processes using the information from the Strand program. These processes are either placed back into the pool or, if they are so primitive as to be immediately executable, they are executed. These executable processes are referred to as kernels. Strand kernels fall into two classes - user defined and language provided. Language provided kernels consist of assignment, scaler arithmetic, and other low level operations. User defined kernels are Fortran and C routines provided by the user. The interface between user defined routines and Strand is discussed in more detail in a later section. In all cases, a process suspends until all of its required data is available. Hence, Strand can be viewed as a data flow language. As mentioned earlier, Strand processes are reduced according to the rules from the Strand program. The Strand program consists of a collection of modules containing Strand procedures. Hence we must understand the syntax of a Strand procedure. A full description of Strand's syntax is available in the textbook Strand: New Concepts in Parallel Programming [4]. Only a summary is given here. A Strand procedure consists of a collection of clauses with the same name. The form of a clause is: H :- G1, G2, ... GM | B1, B2, ... BP. H is the head of the clause. It contains the clause's name and the procedure arguments. G1 through GM are the optional guard kernels which provide tests of the procedure arguments. Finally, if the head matches the process being reduced, the guards all succeed, and all of the required data is available, then the optional body calls, B1 through BP, are activated. The body calls can be other Strand procedures or body kernels. Strand variables are single assignment and are indicated by an initial uppercase letter. Processes share variables to communicate with each other. Strand's simple data types are integer, real and string. These are self explanatory. The compound types are the list and the tuple. The list is defined as a collection of comma separated items: [1, 4, 2.0, "a string", Variable_1] Lists elements are accessed using a head - tail notation: [H1 | Tail] where H1 refers to the first element of the list and Tail to the rest of the list. A tuple is a more general form of a list. While a list is processed sequentially, the elements of a tuple are accessed randomly. Tuples are in some ways simpler than lists but will not be considered further in this paper. The computational model underlying Strand is inherently parallel. STRAND88, however, does not automatically distribute processes across the nodes of a parallel computer. The programmer must explicitly assign a process to a node. For example, the body call to invoke the process, conv, on node N, is: conv(Argument_list)@N, The node address can be a variable, a literal or in some cases a virtual address such as next or previous. The syntax of interprocess communication does not depend on whether the processes reside on the same node or not. Hence, the result of a Strand program does not change as the explicit parallelism is modified. This discussion of Strand's syntax will be much clearer after considering the following example. Example: List Membership Consider the procedure member. The procedure's three clauses define the search of an input list for the indicated KEY. If it finds KEY the variable Status is set to the string "ok" . Otherwise, Status is set to "fail". The Strand Abstract Machine, SAM, would select the member procedure to reduce a process of that name. The head of the procedure's first clause will successfully match to a process with a non empty list for its second argument. Note that the input list is split into its head and tail during the matching. If the guard succeeds, Status is set to ok and the procedure terminates. If the head matches but the guard from the first clause fails, the second clause is attempted. If this guard succeeds, the member procedure is recursively invoked on the tail of the list. Finally, if the second argument of the process being reduced is an empty list, only the head of the third clause can successfully match. This indicates that the key was never found in the list and Status is set to the string fail. ----------------------------------------------------------------------------- Figure 2: The list membership procedure member(Key, [Head | Tail], Status) :- Key == Head | Status := ok. member(Key, [Head | Tail], Status) :- Key =/= Head | member(Key, Tail, Status). member(Key, [], Status) :- Status := fail. ----------------------------------------------------------------------------- This discussion implied that the Strand Abstract Machine considered the three clauses in sequential order. This is not required by the language. The programmer just provides for all cases that can arise during process reduction and does not have to worry about the order of the clauses. The Foreign Function Interface STRAND88 is a complete language providing all the constructs needed to develop interesting applications. It would be of limited practical value, however, if it could not call routines written in C and Fortran. Many parallel algorithms contain large sequential segments. Hence, an effective programming strategy is to embed sequential program fragments in the parallel Strand program. This is easy to express with Strand as any body or guard kernel can be replaced by a call to a C or Fortran routine. To call a Fortran or C routine from Strand, the programmer must build a SAM executable image that includes the foreign code. This is done using information from a user created library definition file. This file defines the mapping between Strand and the foreign code including procedure names and argument types. Once the new SAM has been built, foreign routines can be used just like any other Strand guard or body kernel. STRAND88 includes a library of macros and functions to manipulate Strand data types within Fortran and C routines. One can also create user defined, foreign data types that can be passed between processes through the shared variable mechanism. The following example demonstrates the use of the foreign function interface with a parallel algorithm. Example: Parallel Linear Convolution Linear convolution [6] is a fundamental algorithm of digital signal processing. This section of the paper presents a procedure that returns the linear convolution of two equal length arrays. If the two input arrays, X and H, are of length NX, then the output array, Y, is of length (2*NX-1). The algorithm is easily parallelized since each value in the convolution is computed independently. A C function was written that computes a block of convolution values starting at a given index. Therefore, each node is given a different block of the convolution to perform. All nodes use the same C function. The function declaration is: void conv(ind, size, n, nh, h, nx, x, ny, y) int ind, size, n, nx, ny, *ny; double h[], x[], *y[]; where the output values, ny and y, are pointers to the appropriate type. This function will return in y the next size results of the convolution starting with y [ind] . The Strand procedure call that maps to this function is: conv(Ind, Size, N, H, X, Y) where H, X and Y are lists. Notice that each Strand list maps into two values - an integer representing the length of the array and a double array. Figure 3 presents the code for a procedure called convolution. This procedure will fill the array Y with the convolution of the two input arrays, X and H. An array in Strand is represented as a list of real values (though the experienced reader may notice that Y is a list of lists - one for each block of the convolution). The input to the procedure is the number of elements to use from X and H (NX), the requested number of nodes (Nodes), and the lists representing the input arrays (X and H) . The first clause of the procedure computes the block size for the convolution and the actual number of nodes to use. Notice that // is the modulus operator. When all of the data is ready, the procedure that does the convolution (conv1) is activated. The conv1 procedure consists of two clauses. The first clause handles all but the last block of the convolution and recursively calls itself to initiate computation for subsequent blocks. These are the fixed blocks of size Size. The second conv1 clause handles the last block which may have a different size. Both clauses invoke a Strand procedure, worker, that calls the C function. This example contains a great deal of information. To appreciate its operation, remember that the procedures all run concurrently and, except for data flow synchronization, independently. The node addresses wrap around for cases where there are fewer nodes than requested by the algorithm. ----------------------------------------------------------------------------- Figure 3: Parallel convolution procedure. convolution(Nodes, NX, X, H, Y) :- NY is 2 * NX - 1, ModNY is NY // Nodes, comp_bsize(Nodes, NY, ModNY, Size, NodesToUse), conv1(NodesToUse, 0, NY, Size, NX, X, H, Y). conv1( Node, Index, NY, Size, NX, X, H, Y) :- Node > 0 | NewInd is Index + Size, NewNode is Node - 1, worker(Size, Index, NX, X, H, YHere)@Node, Y := [YHere | YRest], conv1(NewNode, NewInd, NY, Size, NX, X, H, YRest). conv1(0, Index, NY, Size, NX, X, H, Y) :- Length is NY - Index, worker(Length, Index, NX, X, H, Y)@0. worker(Length, Index, NY, X, H, Y) :- conv(Index, Length, NY, H, X, Y). comp_bsize(Nodes, NY, O, Size, NodesToUse) :- NodesToUse := Nodes, Size is NY / Nodes. comp_bsize(Nodes, NY, ModNY, Size, NodesToUse) :- ModNY > O | Size is 1 + NY / Nodes, NodesToUse is NY / Size. ----------------------------------------------------------------------------- Applications STRAND88 is being used for a number of research projects ranging from numerically intensive applications such as weather modeling [7] and protein structure determination [5] to abstract pattern matching problems such as the human Genome project [4]. This section will focus on the protein structure work. The three dimensional shape of a protein is a critical property for understanding the function of a protein. Full determination of protein shapes is a vast problem taxing the abilities of the fastest supercomputers. A valuable preliminary step is to predict where the protein folds into a particular shape known as an alpha helix. Two Caltech chemists, Joe Bryngelson and John Hopfield [5] developed a sequential C program that learns how to predict alpha Helix locations. The program learns how to predict alpha helices for new proteins by carrying out a large nonlinear optimization calculation with known protein structures. Stephen Taylor and Sam Southard [5] parallelized this sequential program using STRAND88 to express the parallel parts of the algorithm. 75% of the original C code was used in the final parallel program. Furthermore, Strand's management of the parallelism was very flexible allowing easy experimentation with different schemes for parallel execution. The program displays linear speedup with up to 32 nodes. The drop off from linearity at 64 nodes is an artifact of the protein data set divided among 64 nodes. A larger data set would allow the program to show linear speed up beyond 32 nodes. The logic required to manage parallelism inevitably carries a cost. The cost incurred by the Strand/C program is small and is overcome by the time the second node is added. Conclusion The Strand language was first offered commercially in the summer of 1989. In a relatively short time it has established itself as a powerful parallel programming language. Strand applications research is progressing at an increasing rate as the number of users grows. It is proving to be a useful language in the numerically intensive areas with current research projects in seismic signal processing [8] and computational fluid dynamics. While the parallel processing industry is still a long way from adopting programming language standards, it is hoped that STRAND88 will play a key role in filling this void. To this end, future work with Strand will focus on both improving the language and on building a large collection of applications. References [1] J. Dongarra and D. Sorenson. "SCHEDULE: Tools for developing and analyzing parallel Fortran programs" in The Characteristics of Parallel Algorithms, MIT Press, 1987. [2] W. Leler. "Linda Meets Unix", IEEE Computer, Feb. 1990, page 43. [3] P. Vishnubhotla. "Concurrency and Synchronization in the ALPS Programming Language". Ohio State University Technical Research Report, OSU-CISRC-12/89-TR56, 1989. [4] Ian Foster and Steve Taylor, Strand: New Concepts in Parallel Programming, Prentice Hall, 1990. [5] S. Southard Jr. "A Parallel Algorithm for Alpha Helix Prediction", Draft, March 12, 1990. [6] A. V. Oppenheim, R.W. Schafer. Digital Signal Processing. Prentice Hall, 1975. [7] I. Foster, R. Overbeek. "Experiences with Bilingual Parallel Programming". Presentation at the Fifth Distributed Memory Computing Conference, 1990. [8] T. G. Mattson, K. Steer. "The Seismic Signal Processing Workbench", SSTI Internal document, 1990.